Indexing and Querying Constantly Evolving Data Using Time Series Analysis

نویسندگان

  • Yuni Xia
  • Sunil Prabhakar
  • Jianzhong Sun
  • Shan Lei
چکیده

This paper introduces a new approach for efficiently indexing and querying constantly evolving data. Traditional data index structures suffer from frequent updating cost and result in unsatisfactory performance when data changes constantly. Existing approaches try to reduce index updating cost by using a simple linear or recursive function to define the data evolution, however, in many applications, the data evolution is far too complex to be accurately described by a simple function. We propose to take each constantly evolving data as a time series and use the ARIMA (Autoregressive Integrated Moving Average) methodology to analyze and model it. The model enables making effective forecasts for the data. The index is developed based on the forecasting intervals. As long as the data changes within its corresponding forecasting interval, only its current value in the leaf node needs to be updated and no further update needs to be done to the index structure. The model parameters and the index structure can be dynamically adjusted. Experiments show that the forecasting interval index (FI-Index) significantly outperforms traditional indexes in a high updating environment.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using Signature Files for Querying Time - SeriesDataHenrik Andr e - J onsson

This paper describes our work on a new automatic indexing technique for large one-dimensional (1D) or time-series data. The principal idea of the proposed time-series data indexing method is to encode the shape of time-series into an alphabet of characters and then to treat them as text. As far as we know this is a novel approach to 1D data indexing. In this paper we report our results in using...

متن کامل

Change Tolerant Indexing for Constantly Evolving Data Technical Report CSD TR # 04 - 006

Index structures are designed to optimize search performance, while at the same time supporting efficient data updates. Although not explicit, existing index structures are typically based upon the assumption that the rate of updates will be small compared to the rate of querying. This assumption is not valid in streaming data environments such as sensor and moving object databases, where updat...

متن کامل

RINSE: Interactive Data Series Exploration with ADS+

Numerous applications continuously produce big amounts of data series, and in several time critical scenarios analysts need to be able to query these data as soon as they become available. An adaptive index data structure, ADS+, which is specifically tailored to solve the problem of indexing and querying very large data series collections has been recently proposed as a solution to this problem...

متن کامل

Querying of Time Series for Big Data Analytics

Time series data emerge naturally in many fields of applied sciences and engineering including but not limited to statistics, signal processing, mathematical finance, weather and power consumption forecasting. Although time series data have been well studied in the past, they still present a challenge to the scientific community. Advanced operations such as classification, segmentation, predict...

متن کامل

Algorithms for Segmenting Time Series

As with most computer science problems, representation of the data is the key to ecient and eective solutions. Piecewise linear representation has been used for the representation of the data. This representation has been used by various researchers to support clustering, classication, indexing and association rule mining of time series data. A variety of algorithms have been proposed to obtain...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005